Convolutional neural network model by deep learning and teaching robot in keyboard musical instrument teaching

Keyboard instruments play a significant role in the music teaching process, providing students with an enjoyable musical experience while enhancing their music literacy. This study aims to investigate the current state of keyboard instrument teaching in preschool education, identify existing challenges, and propose potential solutions using the literature review method. In response to identified shortcomings, this paper proposes integrating intelligent technology and subject teaching through the application of teaching robots in keyboard instrument education. Specifically, a Convolutional Neural Network model of Deep Learning is employed for system debugging, enabling the teaching robot to analyze students’ images and movements during musical instrument play and deliver targeted teaching. Feedback from students who participated in keyboard instrument teaching with the robot indicates high satisfaction levels. This paper aims to diversify keyboard instruments’ teaching mode, introduce the practical application of robots in classroom teaching, and facilitate personalized teaching catering to individual students’ aptitudes.


Introduction
As the economy continues to develop, people's expectations for a higher quality of life have led to a more diverse demand for music. Keyboard instrument teaching in colleges and universities is vital in China's music education landscape. Educational reforms hold irreplaceable significance for the long-term development of music education [1]. Therefore, music educators in colleges and universities are encouraged to reflect on the existing shortcomings in keyboard instrument teaching. Drawing on their insights and experience, this paper proposes innovative measures. The goal is to foster comprehensive and high-quality keyboard instrument teaching in colleges and universities, leading to a richer learning experience for students. Recent years have witnessed changes in the goals of music education at the collegiate level, focusing on novel teaching methods and models to meet the evolving demands of music learners [2].
The advancement of Artificial Intelligence, computing, and related technologies has led to a surge in research focused on intelligent robots. As a result, numerous colleges and universities are incorporating teaching robots into their educational programs. These intelligent robots are gradually finding their way onto campuses, contributing significantly to intelligent services, smart campuses, and science and technology classes [3]. Among the various neural networks utilized in Deep Learning, the Convolutional Neural Network (CNN) stands out as one of the most influential and transformative models, particularly in image analysis and processing. This paper explores the application of CNN in image preprocessing, where teaching robots can directly process the original image inputs to enhance educational experiences [4].
This paper summarizes the current mode of keyboard instrument teaching through a literature review and identifies existing shortcomings. It then proposes integrating teaching robots into keyboard instrument education by incorporating the CNN model under deep learning. Through the analysis of students' images and actions while playing the instrument, personalized teaching approaches are developed. The study investigates the performance of the CNN-based teaching robot in keyboard instrument instruction, revealing its high accuracy in image analysis and action recognition. These findings validate the potential for multiple teaching optimizations within the domain of keyboard instrument education.
Although previous research on using deep CNNs for music teaching exists, this paper introduces a novel instructional robot that leverages a large-scale piano performance dataset comprising various musical pieces, difficulty levels, and playing styles. The expansion of the dataset's size and diversity significantly improves the model's generalization ability and adaptability. As a result, the instructional robot becomes more effective in a broader range of music teaching scenarios. These innovations enhance the educational value and practicality of the instructional robot in keyboard instrument teaching and are expected to positively impact the field of music education. However, further in-depth experiments and optimizations are necessary to address potential challenges and advance the application and development of instructional robots. Continued research and improvements will be essential to unlock the full potential of these technologies in music education.

Literature review
The research on keyboard instrument teaching encompasses various perspectives and innovative ideas to enhance teaching practices and promote educational reform. Marcelo's work focuses on curriculum design, exploring methods to promote teaching development based on field research [5]. Gorgoretti put forward three teaching reform strategies to enrich music teaching in preschool education, including creating a conducive teaching environment, enriching teaching content, and fostering a stronger connection with society [6]. Fang advocates for the effective educational function of keyboard and musical instruments, emphasizing the importance of systematic reform and healthy art education in line with ecological civilization [7]. Overall, these studies shed light on the significance of keyboard instrument teaching in promoting students' learning and fostering educational reform.
This paper provides a comprehensive overview of studies of intelligent robots in education, particularly focusing on their application in keyboard instrument teaching. Yang's study examines robot-assisted instruction as the research object, delving into the theoretical and practical aspects. They explore innovative ideas and schemes to incorporate robots into classroom teaching practices to enhance the effectiveness of robot-assisted instruction [8]. Lee's work revolves around the hardware and software design of a teaching robot. By employing the Hopfield Artificial Neural Network (ANN), they optimize the robot's motion trajectory and validate the results through simulation [9]. Gorbunova's research involves the implementation of the Fuzzy Neural Network algorithm to address the nonlinear characteristics of the teaching robot's arm motion. The study describes the implementation method of the control algorithm and demonstrates improved control performance through simulation technology [10]. Overall, these studies contribute to the research and exploration of intelligent robots in education, specifically in the context of keyboard instrument teaching. They offer practical insights and use simulation techniques combined with ANNs to prove the feasibility and potential benefits of integrating intelligent robots into keyboard teaching research.
Relevant scholars have researched current teaching practices and the potential use of teaching robots. Presently, the teaching process in keyboard instrument education is often perceived as monotonous, with a strong emphasis on skill learning but lacking in engaging and dynamic elements. Consequently, there is a pressing need for innovative teaching models that can captivate students' interest and enhance their learning experience. This paper addresses this issue by exploring the performance characteristics of teaching robots and their potential application in keyboard instrument instruction. By leveraging the CNN model under deep learning, the paper analyzes students' images and actions while they play the keyboard instrument. Through this approach, the teacher can better understand individual students' playing challenges and tailor their teaching methods accordingly to foster a more engaging classroom environment. As a result, students have the opportunity to develop their music-playing skills and music-reading abilities more effectively. Ultimately, the integration of the CNN model with keyboard instrument teaching has the potential to optimize the overall teaching quality and create a more enriched learning experience for students.

Teaching status of keyboard instruments
The teaching of keyboard instruments holds significant importance in college preschool education, where students are required to develop five essential skills: playing, singing, jumping, speaking, and drawing [11]. However, instructing keyboard instruments has presented challenges. Most newcomers lack prior knowledge of musical staff notation, have a limited understanding of keyboard instruments, or may not have any experience with playing the piano or electronic organ. As a vital component of the music teaching process, keyboard instruments offer students the opportunity to appreciate music while enhancing their musical literacy. Therefore, it is essential to focus on the current state of keyboard instrument education in preschool programs [12]. At present, there are various disadvantages in keyboard instrument teaching, as illustrated in Fig 1.
Despite the implementation of a student-centered view in education, the teacher-based teaching mode remains the dominant approach in the teaching process. As a consequence, students must follow the teacher's lead and adhere to prescribed classroom learning procedures. Regrettably, the full implementation of a student-centered model in the teaching process is lacking, resulting in a gradual decline in students' motivation and interest in music learning [13]. Furthermore, many teachers adopt a conservatory-style education method that neglects training students in playing and sound skills, ignoring the cultivation of students' preschool education abilities. Preschool music education exhibits vast differences in direction and content among different schools, leading to a significant disparity in difficulty levels and a lack of meaningful guidance for teaching practices [14]. Therefore, the transformation of teaching ideas in future keyboard instrument teaching is critical, as depicted in Fig 2.
In teaching keyboard instruments for preschool education majors, it is crucial for teachers to understand the individual circumstances of each student. This understanding allows teachers to tailor their teaching resources and conditions to suit the specific needs of each student. In this way, teachers can effectively assist students in mastering staff notation, comprehending proper piano playing techniques, and proficiently identifying various timbres in musical scores. Additionally, teachers should adapt their teaching approach based on students' learning preferences and goals, ensuring continuous improvement in their performance skills [15].
The essence of keyboard and instrument art education lies in the cultivation of music artistry and human civilization. Therefore, in teaching keyboard instruments, students should combine "playing" with "music" to enhance the effectiveness of their performance. It is crucial for students to skillfully master playing, deeply comprehend the historical context and artistic style of musical compositions, and internalize the melody of the music through singing. This approach enables students to recreate the ideas of composers, embody national musical styles, and depict life experiences, thereby solidifying their proficiency in playing keyboard instruments [16].

CNN model based on deep learning
In recent years, deep learning has gained widespread application in image target recognition, speech recognition, and natural language processing. As a cutting-edge technology in machine learning algorithms, deep learning emulates the human brain's analysis and learning processes through neural networks, significantly advancing the field of machine learning [17]. At present, Deep Neural Networks represent the dominant form of deep learning, with Deep Convolutional Neural Networks (DCNN) standing out as classic and extensively used structures [18]. CNN is a kind of neural network with a deep structure and convolution computation. It exhibits distinctive features such as weight sharing, local connection, and convolution pooling. These attributes endow CNN with superior performance in various signal and information-processing tasks compared to fully connected neural networks [19]. The structural arrangement of CNN is depicted in Fig 3.
In the convolution layer, the size and depth of the convolution kernel are manually specified. Then, the learnable convolution kernel is applied to the feature map of the preceding layer. During initialization, the program generates weight parameters randomly and iteratively optimizes them to achieve the optimal classification results and output feature maps. Following convolution, the pooling layer is generally incorporated to reduce the number of features, select relevant features, and diminish the learning parameters of the fully connected layer. The role of the fully connected layer in the CNN is to serve as the "classifier", responsible for the output expression of feature extraction. The final output feature map functions as the input feature vector of the fully connected layer, with its dimensions equaling the number of network nodes of the last output feature map layer. Classification and recognition are performed in the fully connected layer during the training of the classifier model [20].
One-dimensional or two-dimensional convolutions are employed to process signals or images. Based on different convolution inner product calculations, convolution integrals encompass positive-order and reverse-order convolutions [21]. Consider an image $X \in \mathbb{R}^{M \times N}$ and a convolution kernel $W \in \mathbb{R}^{U \times V}$, where $U \le M$ and $V \le N$. The positive-order convolution of $W$ and $X$ is defined as shown in Eq (1).
In Eq (1), $w_{uv}$ and $x_{i+u-1,\,j+v-1}$ represent elements in $W$ and $X$, respectively. The positive-order convolution operation of image $X$ and convolution kernel $W$ is recorded as $Y = W \otimes X$. The definition of the reverse-order convolution of $W$ and $X$ is shown in Eq (2).
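The two definitions can be illustrated with a minimal pure-Python sketch. It assumes that the positive-order form sums $w_{uv}\,x_{i+u-1,\,j+v-1}$ over the kernel (written here with 0-based indices) and that the reverse-order form is the same operation with the kernel rotated by 180 degrees; the function names are hypothetical and chosen only for this illustration.

```python
def positive_order_conv(x, w):
    """Slide kernel w over image x without flipping it (valid region only).

    Assumed form: y[i][j] = sum over u, v of w[u][v] * x[i + u][j + v].
    """
    m, n = len(x), len(x[0])
    u_size, v_size = len(w), len(w[0])
    return [
        [
            sum(w[u][v] * x[i + u][j + v]
                for u in range(u_size) for v in range(v_size))
            for j in range(n - v_size + 1)
        ]
        for i in range(m - u_size + 1)
    ]


def flip_kernel(w):
    """Rotate the kernel by 180 degrees (reverse rows, then reverse columns)."""
    return [row[::-1] for row in w[::-1]]


def reverse_order_conv(x, w):
    """Assumed reverse-order form: positive-order convolution with a flipped kernel."""
    return positive_order_conv(x, flip_kernel(w))


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
print(positive_order_conv(image, kernel))
print(reverse_order_conv(image, kernel))
```

For a symmetric kernel the two results coincide, which is why the distinction matters little once the kernel weights are learned.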
The number of input neurons in one convolution layer is $N$, the convolution size is $K$, and the step size is $S$. Zero padding is used, and $Q$ zeros are filled at both ends. Then, the number of neurons output by the convolution layer is calculated according to Eq (3).
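As a quick check of this relation, the sketch below assumes Eq (3) takes the standard form $(N - K + 2Q)/S + 1$, rounded down when the stride does not divide evenly. The function name is hypothetical.

```python
def conv_output_size(n, k, s=1, q=0):
    """Number of output neurons for input size n, kernel size k,
    stride s, and q zeros padded at each end.

    Assumed standard form: floor((n - k + 2q) / s) + 1.
    """
    if k > n + 2 * q:
        raise ValueError("kernel is larger than the padded input")
    return (n - k + 2 * q) // s + 1


# A 5-wide kernel with stride 1 and no padding shrinks 32 inputs to 28.
print(conv_output_size(32, 5))
# Padding 2 zeros at each end keeps the size unchanged for a 5-wide kernel.
print(conv_output_size(32, 5, s=1, q=2))
```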
Assume $X \in \mathbb{R}^{M \times N \times D}$ is the input feature mapping group of the pooling layer, where each feature map $X^d$ is divided into regions $R^d_{m,n}$. Pooling involves down-sampling each region to obtain a value that summarizes the information in that region. This process includes maximum and average pooling techniques [22]. In maximum pooling, the maximum value within the specified region $R^d_{m,n}$ is selected to represent the entire region, as shown in Eq (4).
In Eq (4), $x_i$ represents the value of each neuron in the specified region. Average pooling selects the average value of the specified region $R^d_{m,n}$ to represent the entire region, as shown in Eq (5).
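For reference, the standard forms of maximum and average pooling that are consistent with these definitions can be written as follows. This is a reconstruction under the assumption that $y^d_{m,n}$ denotes the pooled value of region $R^d_{m,n}$; the symbol $y^d_{m,n}$ is introduced here only for illustration.

```latex
% Maximum pooling over region R^d_{m,n} (assumed form of Eq (4))
y^d_{m,n} = \max_{i \in R^d_{m,n}} x_i

% Average pooling over region R^d_{m,n} (assumed form of Eq (5))
y^d_{m,n} = \frac{1}{\lvert R^d_{m,n} \rvert} \sum_{i \in R^d_{m,n}} x_i
```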
The input feature map $X^d$ is subsampled into $M' \times N'$ regions, and the eigenvalues representing each region are obtained. The output mapping of the pooling layer is obtained as a matrix $Y^d$, represented by Eq (6).
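The pooling step can be sketched in a few lines of pure Python. The example assumes non-overlapping square regions whose size divides the feature map evenly; the function name and the `mode` parameter are hypothetical.

```python
def pool2d(feature_map, size=2, mode="max"):
    """Down-sample one feature map over non-overlapping size x size regions.

    mode="max" keeps the largest value of each region,
    mode="avg" keeps the mean value of each region.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    if rows % size or cols % size:
        raise ValueError("feature map must divide evenly into regions")
    output = []
    for top in range(0, rows, size):
        out_row = []
        for left in range(0, cols, size):
            region = [feature_map[r][c]
                      for r in range(top, top + size)
                      for c in range(left, left + size)]
            if mode == "max":
                out_row.append(max(region))
            elif mode == "avg":
                out_row.append(sum(region) / len(region))
            else:
                raise ValueError("mode must be 'max' or 'avg'")
        output.append(out_row)
    return output


feature_map = [[1, 2, 3, 4],
               [5, 6, 7, 8],
               [9, 10, 11, 12],
               [13, 14, 15, 16]]
print(pool2d(feature_map, mode="max"))
print(pool2d(feature_map, mode="avg"))
```

Here a 4x4 map is reduced to 2x2 regions, so the output matrix has one summarizing value per region.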
This paper examines the functionality of the convolutional accelerator by implementing LeNet-5 for handwritten digit recognition. Initially, the network structure of LeNet-5 is introduced. Subsequently, the process of building the LeNet-5 network structure using the accelerator in the Python environment is described. The collected data is processed, and the accuracy of classifying the handwritten digit test set images is evaluated. Furthermore, a comparison is made between the inference time and power consumption during the prediction process. The architecture of the LeNet-5 CNN is depicted in Fig 5.
The LeNet-5 model is a classic and simple CNN consisting of 2 convolutional layers, 2 subsampling layers, and 2 fully connected layers. The specific structure is illustrated in Fig 7. The input layer of LeNet-5 accepts images of size 32x32 pixels. The first convolutional layer (C1) employs 6 convolutional kernels of size 5x5, with a sliding stride of one pixel. After the convolution operation, 6 feature maps of size 28x28 are obtained. The subsampling layer (S2) utilizes 2x2 max-pooling filters for downsampling, resulting in 6 feature maps of size 14x14, which reduces each feature map to one-fourth of its original area. The second convolutional layer (C3) incorporates 16 convolutional kernels of size 5x5, sliding one pixel at a time, leading to 16 feature maps of size 10x10. The subsampling layer (S4) also adopts 2x2 max-pooling filters for downsampling, generating 16 feature maps of size 5x5. The outputs are then passed through the fully connected layer (C5) to obtain 120 outputs, which are further processed by another fully connected layer (F6) to produce 10 outputs. The final prediction is the class whose output is the largest of these 10 values.
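The feature map sizes quoted above can be verified with a short shape trace. This is only a bookkeeping sketch: it does not build or train a network, it assumes unpadded convolutions and non-overlapping pooling, and all names in it are hypothetical.

```python
def window_output_size(size, kernel, stride=1):
    """Output width of an unpadded convolution or pooling window."""
    return (size - kernel) // stride + 1


# (layer name, output channels, kernel size, stride), as described in the text
FEATURE_LAYERS = [
    ("C1", 6, 5, 1),
    ("S2", 6, 2, 2),
    ("C3", 16, 5, 1),
    ("S4", 16, 2, 2),
]


def trace_lenet5(input_size=32):
    """Return the (name, channels, height, width) of each feature layer
    and the length of the vector passed to the fully connected layers."""
    shapes = [("input", 1, input_size, input_size)]
    size, channels = input_size, 1
    for name, channels, kernel, stride in FEATURE_LAYERS:
        size = window_output_size(size, kernel, stride)
        shapes.append((name, channels, size, size))
    flattened = channels * size * size
    return shapes, flattened


def predict_class(outputs):
    """Pick the index of the largest of the final outputs."""
    return max(range(len(outputs)), key=lambda i: outputs[i])


shapes, flattened = trace_lenet5()
for shape in shapes:
    print(shape)
print("inputs to C5:", flattened)
```

The trace ends with 16 maps of 5x5, that is 400 values, which the fully connected layers reduce to 120 and then to 10 outputs.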

Application of depth-based CNN teaching robot in keyboard and instrument teaching
In the robot-assisted teaching system, robots are introduced as additional components, along with the four core elements of teacher, learner, teaching content, and media, to create an integrated robot-assisted instruction system [23]. The system involves the teacher's pre-assembly and debugging of the robot teaching system, as well as the storage of teaching content and media within the robot. The robot assists the teacher in controlling the presentation of information through the teaching media to the learners. Additionally, the robot gathers feedback according to the learners' learning progress. In case students encounter difficulties, they can seek help from the teacher, who maintains full control over the whole teaching process, providing targeted guidance to the learners [24]. Fig 6 illustrates the overall framework of the control system in the robot-assisted teaching system.
This design incorporates a Visual Basic (VB) Human-Computer Interface to effectively control the teaching robot. Based on the control requirements of the system and the characteristics of various controls provided by VB, the operation interface is structured into two distinct parts: the main window and the sub-windows [25]. The main window is responsible for several essential functions, including parameter input, control mode switching, command transmission, sub-window invocation, and real-time display of system status. The sub-windows mainly serve auxiliary functions, such as system settings and data modifications [26]. The main operation window is meticulously designed to cater to the control requirements, main program flow, and parameter input order. Within the main operation interface, the system can be roughly divided into six modules, as illustrated in Fig 7.
When provided with the initial and end positions of the teaching robot, a collision-free path is determined for the mobile robot within the permissible range of generalized coordinates for movement. In scenarios where multiple locating points are set and the manipulator must reach the points sequentially, an optimal path is sought to minimize the manipulator's movement distance and achieve the shortest possible time [27]. The mode of classroom teaching and image acquisition employed by the teaching robot is depicted in Fig 8.
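The idea of choosing a visiting order that minimizes travel distance can be illustrated with a brute-force sketch. It is not the planner used in the system: it assumes a small number of locating points, straight-line motion in the plane, and no obstacles, so collision avoidance and joint limits are ignored. All names are hypothetical.

```python
from itertools import permutations
from math import dist


def path_length(start, order, end):
    """Total straight-line distance of start -> each point in order -> end."""
    points = [start, *order, end]
    return sum(dist(a, b) for a, b in zip(points, points[1:]))


def shortest_visiting_order(start, waypoints, end):
    """Try every visiting order and keep the shortest one.

    The cost grows factorially with the number of waypoints, so this
    is only practical for a handful of locating points.
    """
    return min(permutations(waypoints),
               key=lambda order: path_length(start, order, end))


start, end = (0, 0), (4, 0)
waypoints = [(2, 0), (1, 0), (3, 0)]
best = shortest_visiting_order(start, waypoints, end)
print(best, path_length(start, best, end))
```

A practical planner would replace the straight-line distance with a collision-aware cost and use a heuristic search once the number of points grows.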
The teaching robot is characterized by its high flexibility, facilitated by its ability to utilize various sensors for the input and output of diverse teaching information. Its mobility is dependent on its versatile mobile morphological structure, allowing it to move freely within the classroom. The teaching robot effectively fulfills classroom teaching tasks akin to a human teacher. The effectiveness of the teaching method by such robots relies on the number of functions and the level of intelligence integrated into their design. Typically, robot teaching necessitates a comprehensive approach encompassing six key aspects: robot control system, voice interaction, visual capabilities, motion control, autonomous navigation, and multimedia functionality [28].
The keyboard instrument teaching robot, utilizing CNNs, offers students a comprehensive and efficient platform for learning keyboard instruments. It incorporates features such as real-time performance assessment and music learning assistance.
The central function of the teaching robot is real-time performance assessment. As the student plays a musical piece on the keyboard instrument, the model continuously monitors the performance, offering assessment and feedback according to predetermined accuracy and performance criteria. This immediate feedback empowers students to promptly comprehend their performance during practice sessions and enhance their playing techniques accordingly.
Moreover, the teaching robot facilitates music learning assistance, a significant feature for students. Students can choose specific musical pieces for learning, and the robot provides invaluable practice support, including a step-by-step breakdown of the musical piece, guidance on practicing challenging sections, and assistance in better comprehending and mastering the music. Notably, the robot offers real-time feedback and meticulously tracks the students' learning progress and performance. Personalized learning suggestions and challenges are provided based on their playing performance to foster continuous improvement throughout the learning process.

Datasets collection
This paper investigates preschool education students in colleges and universities. The implementation of the DCNN teaching robot is observed over a three-month period. Following the principle of voluntariness, preschool education students from four universities in a city are concurrently selected for participation. The locations chosen for the study include areas with high foot traffic, such as classroom entrances and dormitories specifically designated for preschool majors. Prior to and after the introduction of the teaching robot, a questionnaire survey is conducted to gather data. Besides, periodic observations are made on the interactive effects of the teaching robot in the classroom setting. Subsequently, changes in students' learning status are compared with the outcomes of intelligent robot-based teaching practices. Before completing the questionnaire, students are informed about the study's objectives and assured of its anonymous nature, thereby ensuring no impact on their academic pursuits or personal lives. Students are encouraged to provide truthful responses, as their authentic first-hand information serves as invaluable research data.
This paper encompasses four schools and involves preschool majors across various grades. A total of 400 questionnaires were distributed, out of which 390 were collected. After filtering for validity, 385 questionnaires were deemed suitable for analysis, resulting in an effective rate of 96% relative to the 400 distributed. Table 1 lists the specific content configuration of the questionnaire.
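The reported rate can be reproduced from the three counts. The sketch below assumes that the effective rate is the number of valid questionnaires divided by the number distributed, which matches the stated 96%; the function and key names are hypothetical.

```python
def questionnaire_rates(distributed, collected, valid):
    """Return the recovery rate and effective rate of a questionnaire survey."""
    if not 0 <= valid <= collected <= distributed or distributed == 0:
        raise ValueError("expected 0 <= valid <= collected <= distributed, distributed > 0")
    return {
        "recovery_rate": collected / distributed,   # returned / distributed
        "effective_rate": valid / distributed,      # valid / distributed
    }


rates = questionnaire_rates(distributed=400, collected=390, valid=385)
print(f"recovery rate:  {rates['recovery_rate']:.2%}")
print(f"effective rate: {rates['effective_rate']:.2%}")
```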

Experimental environment
The questionnaire survey was conducted within the school premises and targeted preschool education students from four universities in a specific city. To facilitate the study, the teaching robot operated on a CNN model developed using deep learning on a terminal browser. Remarkably, the robot could be controlled without the need for any signal, network, or physical connection. The development platform utilized a web-based application, offering interactive computing capabilities for tasks such as coding, document preparation, and result visualization. Notably, the platform allowed real-time display of video data captured by the teaching robot's camera on web pages.

Parameters setting
The entire questionnaire comprises four sections: basic information, keyboard learning basis, evaluation of teaching methods, and the effect of teaching robot-assisted instruction. To verify the rationality and dependability of the research data, the reliability and validity of the questionnaire were tested. Cronbach's α was employed to assess its reliability, with a coefficient value exceeding 0.8 signifying high reliability, a value between 0.7 and 0.8 indicating good reliability, and a value less than 0.6 suggesting poor reliability. In this paper, the questionnaire exhibited a reliability coefficient value of 0.903, indicating high reliability. Therefore, the research data can be deemed dependable and suitable for subsequent analysis.
The reliability evaluation results of the questionnaire are shown in Table 2.
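For readers who wish to reproduce such a reliability check, Cronbach's α can be computed from item-level scores with the standard formula $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i \sigma_i^2}{\sigma_T^2}\right)$, where $k$ is the number of items, $\sigma_i^2$ the variance of item $i$, and $\sigma_T^2$ the variance of respondents' total scores. The sketch below uses made-up scores purely for illustration, not the survey data, and the function names are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list with one score per respondent."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("at least two items are required")
    totals = [sum(scores) for scores in zip(*item_scores)]  # total per respondent
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def reliability_band(alpha):
    """Map alpha to the bands stated in the text.

    The text gives no label for values from 0.6 up to 0.7,
    so that range is reported as 'unspecified'.
    """
    if alpha > 0.8:
        return "high"
    if alpha >= 0.7:
        return "good"
    if alpha < 0.6:
        return "poor"
    return "unspecified"


# Illustrative scores only: 3 items answered by 5 respondents.
example_items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 5, 2, 4, 1],
]
alpha = cronbach_alpha(example_items)
print(round(alpha, 3), reliability_band(alpha))
print(reliability_band(0.903))
```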

Performance evaluation
Teachers hold the vital role of educators and leaders in the teaching process. As most students have limited prior music education, they rely heavily on teachers' demonstrations and guidance to learn keyboard instruments. The questionnaire is constructed to assess teachers' teaching attitudes and abilities, and students' satisfaction with teachers. The evaluation also encompasses students' perceptions of teachers' conscientiousness and responsibility during classes, as presented in Fig 11.
In Fig 11, the data exhibits considerable variability, particularly in the internal evaluation of School C, which shows a notably positive trend. Approximately 80% of the students from School C reported that their keyboard teachers demonstrate seriousness and responsibility, with only 3% expressing disagreement with this statement. This result suggests that School C's teaching approach is well-regarded by its students. However, School A's data on the same issue is surprising, with 45% of its students expressing dissatisfaction with their keyboard teachers.
In Fig 12, the data reveals a high level of satisfaction among most students of preschool education majors regarding the auxiliary teaching robot's contribution to keyboard teaching in their respective schools, with the highest satisfaction figure reaching 96%. Nevertheless, it is essential to note that students from different schools hold varying views, leading to significant differences in satisfaction levels. Remarkably, students from School C exhibit the highest level of satisfaction with the overall evaluation of keyboard teaching, with 73% of respondents expressing utmost satisfaction and no students expressing any dissatisfaction. Schools B and D [...]
Based on the data analysis presented in Fig 13, the teaching robot model employing the deep CNN achieves a forward inference time of approximately 0.186 seconds. Moreover, the recognition accuracy using the CNN dataset remains consistently high, reaching around 98% across various prediction sample conditions, which fulfills the system design requirements.
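The two metrics reported here, forward inference time and recognition accuracy, can be measured along the following lines. This is a generic sketch rather than the evaluation code behind Fig 13: `predict` stands for any callable that maps one input sample to a predicted label, and both function names are hypothetical.

```python
import time


def mean_inference_time(predict, samples, repeats=3):
    """Average wall-clock seconds per forward prediction."""
    if not samples or repeats < 1:
        raise ValueError("need at least one sample and one repeat")
    start = time.perf_counter()
    for _ in range(repeats):
        for sample in samples:
            predict(sample)
    elapsed = time.perf_counter() - start
    return elapsed / (repeats * len(samples))


def accuracy(predictions, labels):
    """Fraction of predictions that match the reference labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Stand-in predictor used only to exercise the helpers.
def dummy_predict(sample):
    return sample % 10


samples = list(range(20))
labels = [s % 10 for s in samples]
labels[0] = 9  # introduce one deliberate mismatch
predictions = [dummy_predict(s) for s in samples]
print("accuracy:", accuracy(predictions, labels))
print("seconds per prediction:", mean_inference_time(dummy_predict, samples))
```

Averaging over several repeats reduces the influence of one-off delays such as the first call warming up caches.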

Discussion
In the study by Alvarez Dionisi, the development platform for the biped teaching robot's application scheme was designed, enabling posture and walking control. The robot used a camera to collect images of handwritten digits from its front field of vision, which were then input to the CNN accelerator for forward prediction. The research demonstrated that the biped teaching robot development platform effectively incorporates mechanical control, neural network learning, image processing, and other functionalities. Besides, it exhibits high universality and openness, meeting the learning requirements of multidisciplinary cross-integration [29]. Griffith identified three critical issues in current keyboard teaching: students' difficulties in learning keyboard instruments, low professional quality among students, and inadequate teaching quality. Four primary reasons contribute to these challenges: students' low comprehensive abilities, teachers' limitations, insufficient teaching facilities, and unreasonable curricula. To address these issues, Griffith proposed raising teaching quality through improvements in schools, teachers, students, and other aspects. The author emphasized the importance of enriching teaching methodologies and improving students' music cultivation, as well as their keyboard and instrument-playing skills [30].
Based on the DCNN model and considering the operating mechanism of the teaching robot, this paper leverages shape and image information during students' musical instrument play. It identifies action errors and implements targeted teaching for individual students. This approach establishes a new teaching mode wherein teachers lead and robots provide guidance. Consequently, the teaching of keyboard and instrument skills becomes more diversified, enabling the application and cooperation of robots in classroom settings and facilitating personalized teaching to cater to students' aptitudes and abilities. The teaching robot proposed in this paper monitors students' performances in real time and delivers evaluations and feedback. Additionally, the robot provides personalized learning recommendations and challenges based on individual performance, a novel feature compared to conventional music teaching methods. This instantaneous and tailored feedback enables students to promptly identify areas for improvement, refine their skills, and enhance their performance more efficiently.

Conclusion
The teaching robot model developed here holds promising potential for significantly impacting and improving keyboard instrument education. The extensive collection of keyboard instrument performance data and its conversion into musical note sequences have laid a solid data foundation for the model. By adopting the deep CNN LeNet-5 as the model architecture, the teaching robot effectively captures local patterns and features in music, allowing for more sophisticated musical expressions to be learned.
The real-time performance assessment function offers students the advantage of promptly understanding their playing performance and receiving accurate feedback during practice sessions. Moreover, the music learning assistance function provides valuable support to students by systematically breaking down musical pieces and guiding them through challenging sections, thereby enhancing their comprehension and mastery of the music. These features demonstrate the teaching robot's efficacy in advancing students' keyboard instrument learning experiences.
The research findings indicate that the teaching robot's real-time feedback and learning progress tracking functionalities significantly contribute to students' learning outcomes. The timely feedback allows students to make swift adjustments to their playing techniques, leading to continuous improvements in their performance levels. Moreover, the teaching robot offers personalized learning advice and challenges tailored to each student's playing performance, effectively fostering their interest and motivation in the learning process. The model's performance testing demonstrates that the teaching robot, based on deep learning CNNs, achieves a forward inference time of approximately 0.186 seconds. Moreover, under various prediction sample conditions, the recognition accuracy using the CNN dataset reaches approximately 98%, which aligns well with the requirements of the system design. These results reaffirm the teaching robot's effectiveness and reliability in supporting students' musical development and enhancing their learning experiences.
In summary, the keyboard instrument teaching robot developed here utilizing deep CNNs offers a comprehensive and effective platform for keyboard instrument learning. Its range of functionalities, including real-time performance evaluation, music learning assistance, instant feedback, and learning progress tracking, has had a substantial positive impact on students' keyboard instrument learning. Consequently, this teaching robot represents a valuable and innovative tool in the domain of music education. However, certain challenges related to computational efficiency, resource utilization, and user experience must be carefully addressed for broader adoption and improved practicality of this model. These aspects are critical for ensuring a seamless and efficient implementation of the teaching robot in real-world scenarios. By addressing these challenges, the teaching robot has the potential to become a transformative and widely embraced addition to the realm of music education, fostering enhanced learning experiences and achievements among keyboard instrument learners.